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Abstract 

As a greedy algorithm to recover sparse signals from compressed measurements, the orthogonal 
matching pursuit (OMP) algorithm has received much attention in recent years. In this paper, we 
introduce an extension of the orthogonal matching pursuit (gOMP) for pursuing efficiency in recon- 
structing sparse signals. Our approach, henceforth referred to as generalized OMP (gOMP), is literally 
a generalization of the OMP in the sense that multiple indices are identified per iteration. Owing to 
the selection of multiple "correct" indices, the gOMP algorithm is finished with much smaller number 
of iterations compared to the OMP. We show that the gOMP can perfectly reconstruct any X-sparse 
signals (K > 1), provided that the sensing matrix satisfies the RIP with 5nk 

< We also 

demonstrate by empirical simulations that the gOMP has excellent recovery performance comparable 
to ^-minimization technique with fast processing speed and competitive computational complexity. 

Index Terms 

Compressed sensing (CS), orthogonal matching pursuit (OMP), generalized orthogonal matching 
pursuit (gOMP), restricted isometric property (RIP). 
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Generalized Orthogonal Matching Pursuit 



I. Introduction 

As a paradigm to acquire sparse signals at a rate significantly below Nyquist rate, compressive 
sensing has received much attention in recent years [fTj- ffTTl . The goal of compressive sensing 
is to recover the sparse vector using small number of linearly transformed measurements. The 
process of acquiring compressed measurements is referred to as sensing while that of recovering 
the original sparse signals from compressed measurements is called reconstruction. In the sensing 
operation, if-sparse signal vector x, i.e., n-dimensional vector having at most K non-zero 
elements, is transformed into m-dimensional measurements y via a matrix multiplication with 
<fr. This process is expressed as 

y = *x. (l) 

Since n > m for most of the compressive sensing scenarios, the system in © can be classified 
as an underdetermined system having more unknowns than observations, and hence, it is in 
general impossible to obtain an accurate reconstruction of the original input x using conventional 
"inverse" transform of <I>. Whereas, with a prior information on the signal sparsity and a condition 
imposed on <3>, x can be reconstructed by solving the t\ -minimization problem |[T6ll : 

min||x||i subject to <frx = y. (2) 

X 

A widely used condition of $ ensuring the exact recovery of x is called restricted isometric 
property (RIP) [fl2|. A sensing matrix <3> is said to satisfy the RIP of order K if there exists a 
constant 5 E (0,1) such that 

(l-6)\\x\\l<\\3>x\\ 2 2 <(l + 6)\\x\\l (3) 

for any fT-sparse vector x (||x|| < K). In particular, the minimum of all constants 5 satisfying 
© is referred to as isometry constant 5k- It is now well known that if §2K < V% — 1 lfl6l . 
then x can be perfectly recovered using ^-minimization. While £i-norm is convex and hence 
the problem can be solved via linear programming (LP) technique, the complexity associated 
with the LP is cubic in the size of the original vector to be recovered (i.e., 0(n 3 )) so that the 
complexity is burdensome for many applications. 
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Recently, greedy algorithms sequentially investigating the support of x have received consid- 
erable attention as cost effective alternatives of the LP approach. Algorithms in this category 
include orthogonal matching pursuit (OMP) [11 J, regularized OMP (ROMP) [0, stagewise OMP 
(StOMP) [|T3ll , subspace pursuit (SP) [18], and compressive sampling matching pursuit (CoSaMP) 
|fT9l . As a representative method in the greedy algorithm family, the OMP has been widely used 
due to its simplicity and competitive performance. Tropp and Gilbert [11 J showed that, for a 
.fT-sparse vector x and anmxn Gaussian sensing matrix <fr, the OMP recovers x from y = <frx 
with overwhelming probability, provided the number of measurements is m ~ Klogn. Wakin 
and Davenport showed that the OMP can reconstruct i^-sparse vector if 8k+i < [20], and 
Wang and Shim recently provided an improved condition 5k+i < EQ- 

The main principle behind the OMP is simple and intuitive: in each iteration, a column of 
$ maximally correlated with the residual is chosen (identification), the index of this column is 
added to the list (augmentation), and then the vestige of columns in the list is eliminated from the 
measurements, generating a new residual used for the next iteration (residual update). Among 
these, computational complexity of the OMP is dominated by the identification and the residual 
update steps. In the fc-th iteration, the identification requires a matrix- vector multiplication 
between <£' G M nxm and r fc_1 G M. m so that the number of floating point operations (flops) 
becomes (2m — l)n. Main operation of the residual update is to compute the estimate of x, 
which is completed by obtaining the LS solution and the required flops is approximately Akm 
when the modified Gram-Schmidt (MGS) algorithm is applied [221 . In addition, it requires 2km 
flops to perform the residual update. Considering that the algorithm requires K iterations, the 
total number of flops of the OMP is about 2Kmn + 3K 2 m. Clearly, the sparsity K plays 
an important role in the complexity of the OMP. In order to observe the effect of K on the 
complexity, we measure the running time of the OMP as a function of n for three distinct 
sparsity levels (— = 0.02, 0.1, and 0.2). As shown in Fig. [TJ the running time complexity for 
— = 0.2 is significantly larger than that for — = 0.02 since the cost of this orthogonalization 
process increases quadratically with the number of iterations. When the signal being recovered 
is not very sparse, therefore, the OMP may not be an excellent choice. 

There have been some studies on the modification, mainly on the identification step of the 
OMP, to improve the computational efficiency and recovery performance. In [fT3l . a method 
identifying more than one indices in each iteration was proposed. In this approach, referred to 



November 30, 2011 



DRAFT 



3 




n 



Fig. 1. Running time of the OMP as a function of n (m is set to the closest integer of 0.7n). The running time is the sum of 
1000 independent experiments measured using a matlab program under quad-core 64-bit processor and Windows 7 environment. 



as the StOMP, indices whose magnitude of correlation exceeds a deliberately designed threshold 
are chosen. It is shown that while achieving performance comparable to i\ -minimization, StOMP 
runs much faster than the OMP as well as t\ -minimization technique lfl3ll . In J21, another 
variation of the OMP, called ROMP, was proposed. After choosing a set of K indices with largest 
correlation in magnitude, ROMP narrows down the candidates by selecting a subset satisfying a 
predefined regularization rule. It is shown that the ROMP algorithm exactly recovers i^-sparse 
signals under 5 2K < 0.03/v/logiT 11251 

Our approach lies on the same ground of the StOMP and ROMP in the sense that we pursue 
the reduction of complexity and speeding-up the execution time of the algorithm by choos- 
ing multiple indices in each iteration. While previous efforts employ special treatment on the 
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identification step such as thresholding ['13 1 or regularization the proposed method pursues 
direct extension of the OMP by choosing indices corresponding to iV (> 1) largest correlation 
in magnitude. Our approach, henceforth referred to as generalized OMP (gOMP), is literally a 
generalization of the OMP and hence embraces the OMP as a special case (N = 1). Owing to the 
selection of multiple indices, multiple "correct" indices (i.e., indices in the support set) are added 
to the list and hence the algorithm is finished with much smaller number of iterations compared 
to the OMP. Indeed, in both empirical simulations and complexity analysis, we observe that the 
gOMP achieves substantial reduction in complexity with competitive reconstruction performance. 
The primary contributions of this paper are twofold: 

• We present an extension of the OMP, termed gOMP, for improving both complexity and 
recovery performance. By nature, the proposed gOMP lends itself to parallel processing. 
We show from empirical simulation that the recovery performance of the gOMP is much 
better than the OMP and also comparable to the LP technique as well as modified OMP 
algorithms (e.g., CoSaMP and StOMP). 

• We develop a perfect recovery condition of the gOMP. To be specific, we show that the 
RIP of order NK with 5 NK < r/^ m (K > 1) is sufficient for the gOMP to exactly 
recover any fT-sparse vector within K iterations (Theorem 13.81) . Also, as a special case of 
the gOMP, we show that a sufficient condition of the OMP is given by 5k+x < v ^ +1 - 

The rest of this paper is organized as follows. In Section II, we introduce the proposed gOMP 
algorithm. In Section III, we provide the RIP based analysis of the gOMP guaranteeing the 
perfect reconstruction of i^-sparse signals. We also revisit the OMP algorithm as a special case 
of the gOMP and develop a sufficient condition ensuring the recovery of fT-sparse signals. In 
Section IV, we provide empirical experiments on the reconstruction performance and complexity 
of the gOMP. Concluding remarks is given in Section V. 

We briefly summarize notations used in this paper. Let O = {1, 2, • • • , n} then T = { i \ i 6 
Q, Xi 7^ 0} denotes the support of vector x. For D C f2, \D\ is the cardinality of D. T — D = 
T\ (T fl D) is the set of all elements contained in T but not in D. x D E M} D \ is a restriction 
of the vector x to the elements with indices in D. &d £ M mx l D l is a submatrix of $ that only 
contains columns indexed by D. If & D is full column rank, then <3?t = (<&' D <& D )~ l <&' D is the 
pseudoinverse of span(& D ) is the span of columns in & D . P D = <fr D & D is the projection 
onto span{<&£,)- Pjj = I — Pd is the projection onto the orthogonal complement of span(Q>£)). 
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II. gOMP Algorithm 

In each iteration of the gOMP algorithm, correlations between columns of <3? and the modified 
measurements (residual) are compared and indices of the columns corresponding to N maximal 
correlation are chosen as elements of the estimated support set A k . Hence, when N = 1, gOMP 
returns to the OMP Denoting N indices as cp k {l ),-••, 4> k (N), the extended support set at k-th 
iteration becomes A fc = A fc_1 U {(p k (l),--- ,(p k (N)}. After obtaining the LS solution x A fc = 
argmin ||y — $ A fcx|L = <&t fe y, the residual r k is updated by subtracting $ A tx A fc from the 

X 

measurements y. In other words, the projection of y onto the orthogonal complement space of 
span(«3? A fc) becomes the new residual (i.e., r k = P^y). These operations are repeated until 
either the iteration number reaches maximum k max = mm(K, or the £ 2 -norm of the residual 
falls below a prespecified threshold (||r fe || 2 < e) (see Fig. 12). 

It is worth mentioning that the residual r fe of the gOMP is orthogonal to the columns of <fr A fc 
since 

<d> Afc ,r fe ) = <$ Afc ,P^y> (4) 

= $ Afc P A *y (5) 

= (Pi fc * Afc ) ; y=0 (7) 

where © follows from the symmetry of P^ fe (P^ fc = (P Afc ) ) and © is due to P^ fc $ A fe = 
(I — P A fc) $ A fe = <fr A fc — $ A fc$ Afc $ A fc = 0. 1 It is clear from this observation that indices in A fc 
cannot be re-selected in the succeeding iterations and the cardinality of A k becomes simply kN. 
When the iteration loop of the gOMP is finished, therefore, it is possible that the final support 
set A* contains indices not in T. Note that, even in this situation, final result is unaffected and 

'This property is satisfied only when 3? A fc has full column rank, which is true under k < S in the gOMP operation. 
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Fig. 2. Schematic diagram of the gOMP algorithm. 



the original signal is recovered because 

xa* = argmax ||y — <3?a*x|| 2 (8) 



X 



= $ A *y (9) 

= (^.SA-r^.^rXr (10) 

= (^.^a.)" 1 ^. ($a*x a * + $ a *-tX A *-t) (11) 

= x A *. (12) 

where (fTTI) follows from the fact that xa*-t = 0. From this observation, we deduce that as long 
as at least one correct index is found in each iteration of the gOMP, we can ensure that the 
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TABLE I 
gOMP Algorithm 

Input: measurements y, 

sensing matrix 
sparsity K, 

number of indices for each selection TV. 
Initialize: iteration count k = 0, 

residual vector r° = y, 
estimated support set A = 0. 

While ||r fe || 2 > e and k < min{A', m/TV} 
k = k+ 1. 

(Support elements selection) for i = 1, 2, • • • , TV, 

0(i) = arg max \(r k - 1 , ipj)\. 

3:jSn\{A*- 1 ,9l(i-l),- ,<p(l)} 

(Augmentation) A fe = A fc_1 U {(j>{l), ■■■ , 4>{N)}, 
(Estimation of x Afc ) x Afc = argmin |y — $ Afe x|L. 
(Residual Update) r k = y — <l? A ibX A fc. 

End 

Output: x = arg min ||y — <J?x|| 2 . 

x:supp(x)— A fc 



original signal is perfectly recovered within K iterations. In practice, however, the number of 
correct indices being selected is usually more than one so that required number of iterations is 
much smaller than K. 

The whole procedure of the gOMP algorithm is summarized in Table. HI 

III. RIP based Recovery Condition Analysis 

In this section, we analyze the RIP based condition under which the gOMP can perfectly 
recover K-sparse signals. First, we analyze the condition ensuring a success at the first iteration 
(k = 1). Success means that at least one correct index is chosen in the iteration. Next, we study 
the condition ensuring the success in the non-initial iteration (k > 1). Combining two conditions, 
we obtain the sufficient condition of the gOMP algorithm guaranteeing the perfect recovery of 
i^-sparse signals. Following lemmas are useful in our analysis. 
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Lemma 3.1 (Lemma 3 in KT2\l . /fiffl/)." If the sensing matrix satisfies the RIP of both orders 
K 1 and K 2 , then 

for any K\ < K 2 . This property is referred to as the monotonicity of the isometric constant. 
Lemma 3.2 (Consequences of RIP KT2H . /fi9l/): For I C O, if 8m < 1 then for any u G 



(1 - <Vl) H U ll2 < ll*/*HI 2 < (1 + ll U ll 2) 

—L- ||u|| 2 < IK^j^uHa < 
1 + di/i 1 - din 



i - o\i\ 

Lemma 3.3 (Lemma 2.1 in ftfflj and Lemma 1 in KTW): Let I\ , L 2 C be two disjoint sets 
(Ji n J 2 = 0). If 5,^1+1^1 < 1, then 

ll $ /i $u ll 2 = ll*/i*4» u 4»ll 2 - fyil+lfcllMla 
holds for any u supported on I 2 . 

A. Condition for Success at the Initial Iteration 

As mentioned, if at least one index is correct among N indices selected, we say that the 
gOMP makes a success in the iteration. The following theorem provides a sufficient condition 
guaranteeing the success of the gOMP in the first iteration. 

Theorem 3.4: Suppose x G K" is a A'-sparse signal, then the gOMP algorithm makes a 
success in the first iteration if 



/A 

8k+n < —7=r 7=. (13) 



Proof: Let A 1 denote the set of A indices chosen in the first iteration. Then, elements of 
$^ x y are A significant elements in <fr'y and thus 



<fr' A1 y|| 2 = max E|(^, y >| 2 (14) 



ie-f 
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where ipt denotes the z-th column in <fr. Further, we have 



= -L||^y|| 2 . (18) 



This together with y = <3> T x r , we have 



/ N 

Idylls > v^II*t*tx t || 2 (19) 

> \/^(l-^)||x|| 2 (20) 

where (l20l) is from Lemma 13^21 

On the other hand, when no correct index is chosen in the first iteration (i.e., A 1 fl T = 0), 

ll*Aiy|l 2 = ll $ Ai $ TXr|| 2 < ^+jv||x|| 2 , (21) 

where the inequality follows from Lemma l33l The last inequality contradicts (l20l) if 

iV 
K 




5 K+N \\x\\ 2 < \ — (1 - 5 K ) ||x|| 2 . (22) 



Note that, under (1221) . at least one correct index is chosen in the first iteration. Since 5k < 5k+n 
by Lemma 13.11 (1221) holds true when 



N 

5 K+N \\x\\ 2 < <Wa) ||x|| 2 . (23) 



Equivalently, 



In summary, if ^a'+a < ^^ / ^ , then A 1 contains at least one element of T in the first iteration 
of the gOMP. ■ 



5 K+N < v (24) 
+ ViV 
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Q 



T (true support set) 




(estimated support set) 



Fig. 3. Set diagram of fi, T, and A fc . 

B. Condition for Success in Non-initial Iterations 

In this subsection, we investigate the condition guaranteeing the success of the gOMP in 
non-initial iterations. 

Theorem 3.5: Suppose the gOMP has performed k iterations (1 < k < K) successfully. Then 
under the condition 



As mentioned, newly selected iV indices are not overlapping with previously selected ones and 
hence |A fc | = kN. Also, under the hypothesis that gOMP has performed k iterations successfully, 
A fc contains at least k correct indices. Therefore, the number of correct indices / in A k becomes 



Note that we only consider the case where A k does not include all correct indices (/ < K) 
since otherwise the reconstruction task is already finished. 2 Hence, as depicted in Fig. [3l the set 
containing the rest of the correct indices is nonempty (T — A k ^ 0). 

Key ingredients in our proof are 1) the upper bound of A-th largest correlation in 
magnitude between r k and columns indexed by F = f2\(A fe U T) (i.e., the set of remaining 

2 When all the correct indices are chosen (T C A fc ) then the residual r fe = and hence the gOMP algorithm is finished 
already. 




(25) 



the gOMP will make a success at the {k + l)-th condition. 



I = ITH A fc | > k. 
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incorrect indices) and 2) the lower bound fii of the largest correlation in magnitude between r k 
and columns whose indices belong to T — A k . If 0x is larger than czn, then 0x is contained in 
the top iV among all values of | (yjj, r fc ) | and hence at least one correct index is chosen in the 
(k + l)-th iteration. 

The following two lemmas provides the upper bound of a N and the lower bound of /3\, 
respectively. 

Lemma 3.6: Let = | ((psm, r fe ) | where 4>(i) = arg max |(^-,r fe )| so that ctj 

j:jeF\{cl>(l),-,<j>(i-l)} 

are ordered in magnitude (a± > «2 > •••)• Then, in the (k + l)-th iteration in the gOMP 
algorithm, a^, the iV-th largest correlation in magnitude between r k and {y?j} ie i?, satisfies 

^ fc . 8 N+Nk 5 Nk+K -l \ ||x T _ A fc|| 2 

«iv < o N+K _i H = — . (26) 



1 - 5 Nk J JN 



Proof: See Appendix lAl 



Lemma 3.7: Let &: = \(<P6(i\,Y k )\ where <b(i) = are max (<£>,-, r fe ) so that 

^ are ordered in magnitude > > • • • )• Then in the (k + l)-th iteration in the gOMP 
algorithm, f3±, the largest correlation in magnitude between r fc and {y?j}j e T-A fe > satisfies 

h ~\ K -' " (T^fef " l+s '-V W=T <27) 

Proof: See Appendix |B] ■ 

Proof of Theorem 13.51 

Proof: A sufficient condition under which at least one correct index is selected at the 
(k + l)-th step can be described as 

a N <(3 1 . (28) 

Noting that 1 < k < I < K and 1 < N < K and also using the monotonicity of the restricted 
isometric constant in Lemma 13.11 we have 

K-l< NK — > §K-l < &NK, 

Nk + K -I < NK -> 5 Nk +K-i < S NK , 
Nk < NK -» 5 Nk < 5 NK , 
N + Nk < NK -> 5 N+Nk < 5 NK . (29) 
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From Lemma [3T61 and (|29l ), we have 

^ f r . 5 N+Nk 5 N k+K-l\ ll x T-A fc ll 2 ^ f x &NK \ ll X T-A fe 2 

OiN < ON+K-l H : 7 7= < "NK + Z 7 7= • (30) 

V l~$Nk ) VN \ 1-5 NK J VN 

Also, from Lemma [3771 and (|29l ), we have 

Pi > ( 1 - dif-J - — ; -jOArfe+^.fc 



1 + $NK j-2 \ II X T-A fc ll2 

Using (1301 ) and (f3Tb , we obtain the sufficient condition of (1281 ) as 

1 + 2 \ ||x T _ A fc|| 2 / \ ||x T _ A fc|| 2 



> ( i - s NK - 77 A ) J s==^- (3D 



After some manipulations, we have 



N 

S NK < — \ =. (33) 



N 

S NK < * - 7 = , (34) 



Since ^fW^l < VW, dH holds if 

^ + 2^ 

which completes the proof. ■ 

C. Overall Sufficient Condition 

Thus far, we investigated conditions guaranteeing the success of the gOMP algorithm in the 
initial iteration (k = 1) and non-initial iterations (k > 1). We now combine these results to 
describe the sufficient condition of the gOMP algorithm ensuring the perfect recovery of K- 
sparse signals. 

Recall from Theorem 13.41 that the gOMP makes a success in the first iteration if 



JN 

$k+n < —f=r 7=- (35) 

1C + VN 



Also, recall from Theorem 13.51 that if the previous k iterations were successful, then the gOMP 
will be successful for the (k + l)-th iteration if 



JN 

&NK < —f=t 7=- (36) 

^ + 2^ 



The overall sufficient condition is determined by the stricter condition between (1351) and (|36l) . 



November 30, 2011 



DRAFT 



13 



Theorem 3.8 (Sufficient condition of gOMP): For any A-sparse vector x, the gOMP algo- 
rithm perfectly recovers x from y = <3?x via at most K iterations if the sensing matrix <f> 
satisfies the RIP with isometric constant 

*»* < v£h for K>1, (37) 
5 2 < \ for K = 1. (38) 

Proof: In order to prove the theorem, following three cases need to be considered. 
. Case 1 [N>1,K > 1]: 

In this case, NK > K + N and hence 5nk > 5k+n and also /7 /^ f /T7 > / — . Thus, 

\K-\-yN V K+2y JS 



361) is stricter than (1351) and the general condition becomes 



8 NK < V (39) 
VK + 2VN 



Case 2 [N = 1, K > 1]: 

In this case, the general condition should be the stricter condition between 8k < -7= — and 

b ^ VK+2 

5 K +i < v ^ +1 - Unfortunately, since 5k < 8k+i and < v ^ +1 , one cannot compare 

two conditions directly. As an indirect way, we borrow a sufficient condition guaranteeing 
the perfect recovery of the gOMP for iV = 1 as 



8k < • W 



Readers are referred to [24] for the proof of Since < VK ^ K for K > 1, the 

sufficient condition for Case 2 becomes 

'* * 7WT2 <41) 



Nice feature of (I4TT) is that it can be nicely combined with the result of Case 1 since 
applying N = 1 in ([39]) results in (I4TT) . 
Case 3 [A" = 1]: 

Since x has a single nonzero element (K = 1), x should be recovered in the first iteration. 
Let u be the index of nonzero element, then the exact recovery of x is ensured regardless 
of iV if 

\(fu,y)\ = max|(<pi,y)| . (42) 
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The condition ensuring (1421) is obtained by applying N = K = 1 for Theorem 13.41 and is 
given by 6 2 < \. 

■ 

Remark 1 (Related to the measurement size of sensing matrix): It is well known that m x n 
random sensing matrix with i.i.d. entries with Gaussian distribution N(l,l/m) obey the RIP 
with 5k < £ with overwhelming probability if the dimension of the measurements satisfies IfTTl 

m = { i > ) ■ (43) 

Plugging (1371) into (|43l) . we have 

m = O [K 2 log j^j . (44) 

Note that the same result can be obtained for the OMP by plugging 5k+i < mt0 mto 

(SI. 



D. Sufficient Condition of OMP 

In this subsection, we put our focus on the OMP algorithm which is the special case of the 
gOMP algorithm for N — 1. For sure, one can immediately obtain the condition of the OMP 
8k < -rb by applying N = 1 to Theorem 13.81 Our result, slightly improved version of this, 
is based on the fact that the non-initial step of the OMP process is the same as the initial step 
since the residual is considered as a new measurement preserving the sparsity K of an input 
vector x [[24]] . In this perspective, a condition guaranteeing to select a correct index in the first 
iteration is extended to the general condition without occurring any loss. 

Corollary 3.9 (A direct sequence of Theorem I3.4D : Suppose x 6 1™ is i^-sparse, then the 
OMP algorithm recovers an index in T from y = <frx E W 71 in the first iteration if 

S K +i < 7= - . (45) 
VK+ 1 

We now state that the first iteration condition is extended to any iteration of the OMP algorithm. 

Lemma 3.10 (Wang and Shim /[271/): Suppose that the first k iterations (1 < k < K — 1) of 
the OMP algorithm are successful (i.e., A k C T), then the (k + l)-th iteration is also successful 
(i.e., t k+1 G T) under 5 K+ i < 
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Proof: The residual at the /c-th iteration of the OMP is expressed as 

r fc = y-$ Afc x Afc . (46) 

Since y = $7X7 and also <fr A fc is a submatrix of <&r under hypothesis, r fe 6 span (<&t)- Hence, 
r fc can be expressed as a linear combination of the |T| (= IT) columns of <& T and can be 
expressed as 

r k = <&x 

where the support of x' is contained in the support of x. In other words, r k is a measurement of 
.fT-sparse signal x' using the sensing matrix From this observation together with the corollary 
13 .91 we conclude that if A k 6 T, then the index chosen in (k + l)-th iteration is an element of 
T under (03]). ■ 
Combining Corollary 13.91 and Lemma 13.101 and also noting that indices in A fc is not selected 
again in the succeeding iterations (i.e., the index chosen in [k + l)-th step belongs to T — A fc ), 
A K = T and the OMP algorithm recovers original signal x in exactly K iterations under 

6k +i < VW+i' 

Following theorem formally describes the sufficient condition of the OMP algorithm. 
Theorem 3.11 (Wang and Shim /f271/): Suppose x is i^-sparse vector, then the OMP algorithm 
recovers x from y = <frx under 

< 7 ^ TT - (47) 
Proof: Immediate from Corollary 13.91 and Lemma 13. 101 ■ 



IV. Simulations and Discussions 

A. Simulations Setup 

In this section, we empirically demonstrate the effectiveness of the gOMP in recovering the 
sparse signals. Perfect recovery conditions in the literatures (including the condition of the gOMP 
in this paper) usually offer too strict sufficient condition so that empirical performance has been 
served as a supplementary measure in many works. In particular, empirical frequency of exact 
reconstruction has been a popularly used tool to measure the effectiveness of recovery algorithm 
ED, ll25i By comparing the maximal sparsity level at which the perfect recovery is ensured 
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(this point is often called critical sparsity), superiority of the reconstruction algorithm can be 
evaluated. In our simulations, following algorithms are considered. 

1) LP technique for solving £ 1 -minimization (http://cvxr.com/cvx/). 

2) OMP algorithm. 

3) gOMP algorithim (N = 5). 

4) StOMP with false alarm control (FAC) based thresholding (http://sparselab.stanford.edu/). 3 

5) ROMP algorithm (http://www.cmc.edu/pages/faculty/DNeedell/index.html). 

6) CoSaMP algorithm (http://www.cmc.edu/pages/faculty/DNeedell/index.html). 

In each trial, we construct mxn (m = 128 and n = 256) sensing matrix $ with entries drawn 
independently from Gaussian distribution A/"(0, 1/m). In addition, we generate fT-sparse signal 
vector x whose support is chosen at random. We consider two types of sparse signals; Gaussian 
signals and pulse amplitude modulation (PAM) signals. Each nonzero element of Gaussian signals 
is drawn from standard Gaussian and that in PAM signals is randomly chosen from the set 
{±1, ±3}. In each recovery algorithm, we perform 1000 independent trials and plot the empirical 
frequency of exact reconstruction. 

B. Simulation Results 

In Fig. SI we provide the recovery performance as a function of the sparsity level K. Clearly, 
higher critical sparsity implies better empirical reconstruction performance. The simulation re- 
sults reveal that the critical sparsity of the gOMP algorithm is much better than ROMP, OMP, 
and StOMP. Even compared to the LP technique and CoSaMP, the gOMP exhibits a bit improved 
recovery performance. Fig. [5] provides results for the PAM input signals. We observe that overall 
behavior is similar to the case of Gaussian signals except that the t\ -minimization is slightly 
better than the gOMP. Overall, we can clearly see that the gOMP algorithm is very competitive 
for both Gaussian and PAM input scenarios. 

C. Complexity of gOMP 

In this subsection, we discuss the computational complexity of the gOMP algorithm. Com- 
plexity for each step of the gOMP algorithm is summarized as follows. 

3 Since FAC scheme outperforms false discovery control (FDC) scheme, we exclusively use FAC scheme in our simulations. 
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Sparsity 

Fig. 4. Reconstruction performance for /^-sparse Gaussian signal vector as a function of sparsity K. 



• Support elements selection: The gOMP performs a matrix-vector multiplication <&'r fe_1 , 
which needs (2m — l)n flops (m multiplication and m — 1 additions). Also, <J>'r fc 1 needs 
to be sorted to find N best indices, which requires nN — N(N + l)/2 flops. 

• Estimation of x A fc: In this step, the LS solution is obtained using the MGS algorithm. 
Using the QR factorization (3>A fc — QR-X we nave 

x Afc = (^^A*)" 1 $ Afc y = (R'Rr 1 R'Q'y 

By recycling the part of the QR factorization of ^a*- 1 computed in the previous iteration, 
the LS solutions can be solved efficiently (see Appendix O for details). As a result, the LS 
solution can be obtained with a cost of 4N 2 km + (-2N 2 + 5N)m + 2N 3 k 2 + (-4iV 3 + 
5N 2 )k + 3N 3 -N 2 -N. 
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Sparsity 

Fig. 5. Reconstruction performance for 7\"-sparse PAM signal vector as a function of sparsity K. 

• Residual update: For updating residual, the gOMP performs the matrix-vector multiplica- 
tion <E» A fcX A fc ((2Nk — l)m flops) followed by the subtraction (m flops). 
Table [II] summarizes the complexity of the gOMP in each iteration. If the gOMP is finished 
in S iterations, then the complexity of the gOMP, denoted as C 9 omp(N, S,m,n), becomes 

C g0 Mp(N, S, m, n) « 2Smn + (2N 2 + N)S 2 m. (48) 

Noting that S < mm{K, m/N} and N is a small constant, the complexity of the gOMP is upper 
bounded by O(Kmn). In practice, however, the iteration number of the gOMP is much smaller 
than K due to the parallel processing of multiple correct indices, which saves the complexity of 
the gOMP substantially. Indeed, as shown in Fig. [6] the number of iterations is only about 1/3 
of the OMP so that the gOMP has an advantage over the OMP in both complexity and running 
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TABLE II 

Complexity of the gOMP algorithm 



Step 


Running time 


Support elements selection 


(2m - 1 + N)n - N(N + l)/2 = 0{mn) 


Estimation of x A t 


» 4N 2 km = O(fcm) 


Residual update 


2Nkm 


Total cost of the fc-th iteration 


w 2mn + (4iV 2 + 2N)km = 0{mn) 




Fig. 6. Number of iterations of the OMP and gOMP (N = 5) as a function of sparsity K. 



time. 
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20 25 
Sparsity K 



Fig. 7. Running time as a function of sparsity K. Note that the running time of the l\ -minimization is not in the figure since 
the time is more than order of magnitude higher than the time of other algorithms. 



D. Running time 

In Fig. [7J the running time (average of Gaussian and PAM signals) for recovery algorithms 
is provided. The running time is measured using the MATLAB program under quad-core 64-bit 
processor and Window 7 environments. Note that we do not add the result of LP technique simply 
because the running time is more than order of magnitude higher than that of all other algorithms. 
Overall, we observe that the running time of StOMP, CoSaMP, gOMP, and OMP is more or less 
similar when K is small (i.e., the signal vector is sparse). However, when the signal vector is less 
sparse (i.e., when K is large), the running time of the OMP and CoSaMP increases much faster 
than that of the gOMP and StOMP. In particular, while the running time of the OMP, StOMP, 
and gOMP increases linearly over K, that for the CoSaMP seems to increase quadratically over 
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K. The reason is that the CoSaMP should compute completely new LS solution over the distinct 
subset of <3> and hence previous QR factorization cannot be recycled |fT9ll . Also, it is interesting 
to observe that the running time of the gOMP and StOMP is fairly comparable. Considering the 
fact that the thresholding (FAC or FDC) is required for each iteration of the StOMP, the gOMP 
might be a bit favorable in implementation. 

V. Conclusion 

As a cost-effective solution for recovering sparse signals from compressed measurements, 
the OMP algorithm has received much attention in recent years. In this paper, we presented 
the generalized version of the OMP algorithm for pursuing efficiency in reconstructing sparse 
signals. Since multiple indices can be identified with no additional postprocessing operation, the 
proposed gOMP algorithm lends itself to parallel processing, which expedites the processing of 
the algorithm and thereby reduces the running time. In fact, we demonstrated in the empirical 
simulation that the gOMP has excellent recovery performance comparable to i\ -minimization 
technique with fast processing speed and competitive computational complexity. Also, we showed 
from the RIP analysis that if the isometry constant of the sensing matrix satisfies 5nk < ^/j^^ 
then the gOMP algorithm can perfectly recover K-sparse signals (K > 1) from compressed 
measurements. One important point we would like to mention is that the gOMP algorithm is 
potentially more effective than what this analysis tells. Indeed, the bound in (ITTT) is derived 
based on the worst case scenario where the algorithm selects only one correct index per iteration 
(hence requires maximum i^-th iterations). In reality, as observed in the empirical simulations, 
it is highly likely that multiple correct indices are identified for each iteration and hence the 
number of iterations is much smaller than that of the OMP. Therefore, we believe that less strict 
or probabilistic analysis will uncover the whole story of the CS recovery performance. Our future 
work will be directed towards this avenue. 
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Appendix A 
Proof of Lemma [376] 

Proof: Let io, be the i-th largest correlation in magnitude between r fc and {ifj}jeF (i- e -> 
columns corresponding to remaining incorrect indices). Also, define the set of indices W = 
{wi,w 2 , ■ ■ ■ ,w N }. The £ 2 -norm of the correlation <&' w r k is expressed as 

ll*V rfe || 2 = ||^V P A^T-AfcX T _ Afc || 2 

= ||*W*T-A fcX T-A* ~ *V P A fc ^T-A fcX T-A fc ll 2 

< ||$ , H/ ^T-AfcX T _ Afc || 2 + ||$V P A**T-A*X T _ Afc || 2 , (49) 

where = I - P A *. Since W and T — A fc are disjoint (i.e., W n (T - A fc ) = 0) and 
| W| + |T — A fc | = A + A — I (note that the number of correct indices in A k is I by hypothesis). 
This together with Lemma 13.31 we have 

\\& w <fr T _ A kX T _ Ak \\ 2 < 5 A r +A -_ i ||x T _ A fc || 2 . (50) 

Similarly, noting that W D A fc = and |W| + |A fc | = A + Afc, we have 



|$V P A^r-A*x T _ Afc || 2 < 8 



N+Nk 



where 



Afe <P T _ A fcX T _ A fc 



IS 



A fc^T-A fcX T-A fc 



< 



($ Afe $ Afc ) V^Sy _ A fcX T _ A fc 



1-5 



Nk 



A fcV T _ A feX T _ A fc|| 2 



(51) 

(52) 
(53) 

lr -T-A fe ll2> (^4) 

where (1531 and (1541 follow from Lemma [3721 and Lemma [3731 respectively. Since A fc and T — A k 
are disjoint, if the number of correct indices in A k is I, then |A fe U (T — A fc ) | = A&; + K — I. 
Using (Bob . (Bib , and ([541), we have 



, £>Nk+K-l 
< ~ l|X 



N+K-l 



+ 



i -T—A k II 2 • 



(55) 



Since = ((y^, r fc )|, we have \\® w r k 



1 — <^Affc 
N 

XTJ a i- Now, using the norm inequality 4 , we have 



i=i 



> 



1 " 

i=i 



(56) 
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Since ai > a 2 > • • • > a N , it is clear that 



N 



W- 1 1 2 

Combining (1551) and (1571) , we have 

ON+Nk°Nk+K-l 



and hence 



1-5 



^ ||x T _ A fc|| 2 > y/Na N , 



®N < &N+K-1 + 



SN+Nk^Nk+K-l\ II X T-A fc ll2 



1-5 



Nk 



N 



(57) 



(58) 



(59) 



Appendix B 
Proof of Lemma 1X71 

Proof: Since r fe = y - * Afe $ Afe y = P^y, we have 

ll„fcll 2 



(P^y)'P^y = (Pl^r-A^-A^'Piy- 



Employing the idempotency and symmetry properties of the operator P^ fc (i.e. 
and P^ fc = (P^ fc y), we further have 

W^lll = ( $ T-AfcX T _ Afc )'PA fc y = |<$ T _ Afc X T _ Afc ,r fc )| . 

Noting that <fr T _ A fcX T _ A fc = x j¥ji we further have 

jeT-A fc 



\j£T-A k I 



\ x j I 



jeT-A fc 

Since /?i is the largest correlation in magnitude between r k and {>£j}jeT-A k > it is clear that 



(60) 



|<Vi,r*>|<A 



for all j ET — A . Applying this to (|60l) . we obtain 

ll'llll- l^il ^ = H X T-Afc|li/3l- 

jeT-A fe 

Noting that the dimension of x T _ A fe is K — I, using the norm inequality 



(61) 



(62) 



l x T-A fc lli < V K — Z||x T _ 



A fc II 2' 
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we have 

||r fc ||2< VA^Uxr^Uaft. (63) 

In addition, noting that r fc = P^ fc (<fr A fcX A fc + <fr T _ A fcX T _ A fc) and P^ fc $ A fcX A fe = 0, 1 1 r fc 1 1 ^ can 
be rewritten as 

W^Wl = \\ P A k ^T-A k ^T-A k \\l 

— \\^T-A kJ< -T-A k II2 ~~ l|PA fc ^T-A feX T-A* II2 • 

Using the definition of RIP, we get 

||$ T _ A fcX T _ Afc ||2 > (1 -S K -i) ||x T _ Afc ||2. (64) 

On the other hand, 

||P A fc$T-Afc X T-A*ll2 = ^A fc (*A^AO _1$ Afe $ T-AfeX T „ A fc 

< (l + S Nk ) ($ Afc $ Afc ) _1 * Afc *T-AfeX T _ Afc 



(65) 
(66) 



1 + 5Nk 111/ ^ ,,2 t f ~i \ 

- 7, r^2 ll*A**T-A*Xr-A*ll 2 ( 67 ) 

(1 - djvfc) 

^fc+A--t(l + <W 11 |,2 , Afi v 

- Pi X ^2 II X T-AHI 2 > ( 68 ) 

(1 - d Afc ) 2 

where (|66l ) is from the definition of RIP and (1671) and (|68l) follow from Lemma 13.21 and 13 .31 
respectively (A fc and T - A k are disjoint sets and | A fc U (T - A fe ) | = Nk + K - I). 
Combing ([64]) and (1681) , we have 

ll r Il2 ^^-^-i (l-S Nk )' )\\ X T-A k \\ 2 - 

From (|63l) and (|69l>. 

( 1 - <fjc_j - + 6 x Nk . 2 S 2 Nk+K _i) \\x T -A k \\l < VK - J||x T _ A * || 2 /3i, (70) 

V [l-ONk) J 

and hence 

> (\ A i + frvfc x2 \ Il x r-A fc ll2 nu 

01 - V ~ 6k - 1 ~ (i^f w-ij ( 71 ) 
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Appendix C 

Computational cost for the "estimate" step of gOMP 
In the k-th iteration, the gOMP estimates the nonzero elements of x by solving an LS problem, 



x Afe = argmin ||y - * A *x|| 2 = $ Afc y = ($ Afc $ Afc ) $ Afc y. 



(72) 



To solve (1721) . we employ the MGS algorithm in which the QR decomposition of previous 
iteration is maintained and, therefore, the computational cost can be reduced. Without loss of 
generality, we assume <fr A fc — [ipx ip 2 • " fNk ) • The QR decomposition of $> A k is given by 



® A k = QR (73) 
mxNk cons j S (; S f jy/j orthonormal columns obtained via 



where Q = ( qi q 2 • ■ ■ q^) G 

MGS algorithm, and R G JH NkxNk is an upper triangular matrix, 

^(qi^i) (q2,v ? 2) ••■ (q.i,<PNk)\ 

„ (q 2 ,<^2) ••• (q.2,¥Nk) 

\ 1 ••• {q N k,^Nk)J 

For notation simplicity we denote Rij = (q^, cpj) and p = N(k — 1). In addition, we denote the 
QR decomposition of the (k — l)-th iteration as <fr A fc-i = Q iR i. Then it is clear that 

R-l Ra 

R b 

omxA^ ^ an( j gj ven by 



Q = (Q-i Qo) and R = ^ 



(74) 



where Q = (q p+1 ■ • ■ q Nk ^ G 



p+l • • • R\,Nk 



R a 



\Rp,p+i 



R 



and R b 



Rp+l,p+l ■ ■ ■ Rp+l,Nk 



\ 



o 



(75) 



Nk,Nk J 



Applying d73j) to ([72]), we have 



x Afe = (RR) 1 R'Q'y. 



(76) 

We count the cost of solving (|76l) in the following steps. Here we assess cost in the classical 
sense of counting floating-point operations (flops), i.e., each +, — , *, /, y j~ counts as one flop. 
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Cost of QR decomposition: 

To obtain Q and R, one only needs to compute Q , R a and R b since the previous data, 
Q_i and R i, are stored. For j = 1 to N, we sequentially calculate 

{Ri,p+j}i=l,2,-,p+j-l — {vli> <pj)si=l,2,-,p+j-li (77) 

p+j-l 

qp+i = Pp+j — Riv+jQii (78) 

i=l 

q P+ , = TF^V, (79) 
Rp+j,p+j — (cip+jiVp+j) ■ (80) 

Taking j = 1 for example. One first computes {i?i, p +i}i=i,2,--- , P using Q i (requires p{2m — 
1) flops) and then computes q p+ i = (p p+ i — YTi=i Ri,p+i°Li (requires 2mp flops). Then, 
normalizing q p+ i needs 3m flops. Finally, computing R p+ i tP+ i requires 2m — 1 flops. The 
cost of this example amounts to Amp + 5m — p — 1. Similarly, one can calculate the other 
data in Q and ^R a R b j . In summary, the cost for this QR factorization becomes 

C QR = AN 2 mk - 2mN 2 + 3mN - N 2 k + -N 2 - -N. (81) 

2 2 



Cost of calculating Q'y 

Since Q = (q_x Q )> we have 




(82) 



By reusing the data Q'^y, (f82l) is solved with 

d = N{2m - 1). 

• Cost of calculating R'Q'y: 
Applying R' to the vector Q'y, we have 



R'Q'y =1 R -i Q -i y I. (83) 



R^Q' lY + R'bQ^ 
Since the data R' X Q' x y can be reused, ([83]) is solved with 

C 2 = 2N 2 k - N 2 . 
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Cost of calculating (R R) 1 

Since R is an upper triangular matrix, 

(RR) -1 = (RT'R- 1 . (84) 

Applying the block matrix inversion, 

R _i = /R-i ^/(R x) 1 -(R_ 1 )- 1 R a (R b )-A (g5) 

\ K h ) \ (R b ) 1 j 

Then we calculate (RR) -1 = (R') _1 R _1 , i.e., 

(RR)- = ( (^i)- 1 ^)- 1 -(R'-J-^R-O-RaCRO-A 
^-(RD^R^R'.J-^R-i)- 1 (R'J-^Rb)- 1 ; 

We can reuse the data (R / _ 1 ) _1 (R_i) _1 so that the cost of calculating (R b ) _1 , (R b ) _1 (R b ) _1 , 
and -(R'_ 1 )- 1 (R_ 1 )- 1 R a (R b )- 1 becomes N{N + l)(2iV + l)/3 (Gaussian Elimination 
method), N(N + l)(2N + l)/6, and 2N 3 k 2 - AN 3 k + 2iV 3 , respectively. The cost for 
computing (R'R) -1 is 

C 3 = 2N 3 k 2 - AN 3 k + 3N 3 + -N 2 + -N. 

Cost of calculating x A fc = (R'R) 1 R'Q'y 
Applying (R'R) -1 to the vector R'Q'y, we obtain 

/ (RLJ-^R-O^R-xQ'-xy + &\ 

x Afc = (87) 

V 6 + 6 J 

where 

6 = -(R'_ 1 )- 1 (R_ 1 )- 1 R a (R b )- 1 R' a Q'_ 1 y + R' b Q' y, 

6 = -(R^-RUR-J-^R-^^RxQ'-xy, 

6 = (R / b )- 1 (R b )- 1 R a Q / _ 1 y + R / b Q[ ) y. 

We can reuse (R / _ 1 ) _1 (R_i) _1 R / _ 1 Q / _ 1 y so that calculating £i, £ 2 and £ 3 need (2iV — 
l)JV(fc - 1), (2N(k - 1) - l)iV and (2iV - 1)N flops, respectively. The cost of this step 
becomes 

C A = AN 2 k - 2N 2 . 
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In summary, whole cost of solving LS problem in the A;-th iteration of gOMP is the sum of 
the above and is given by 

C-ls — C-qr + Ci + C 2 + C 3 + C 4 

= AN 2 km + (-2N 2 + 5N)m + 2N 3 k 2 + (-AN 3 + 5N 2 )k + 3N 3 - N 2 - N. 
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